Analyze Tweets of the French Presidential Election
Posted on Sun 23 September 2018 in Data Analysis
French Presidential Election: Tweets (03/02/2017 - 26/02/2017)
This dataset comes from the Twitter API. The data were collected for 10 minutes every hour over three weeks, and the stream kept only tweets that mention the name of one of a few selected candidates. You can find the dataset here: https://www.kaggle.com/jeanmidev/french-presidential-election
Let's analyze the data.
In [1]:
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
import sqlite3
import datetime
Converting the SQLite database to a pandas DataFrame
In [2]:
connection = sqlite3.connect("database.sqlite")
df = pd.read_sql_query("select * from data;", connection)
connection.close()
df.head()
Out[2]:
In [3]:
lines = df.shape[0]
print("Number of Tweets : " + str(lines))
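Before loading everything with `select *`, it can help to check which tables the SQLite file contains and how many rows they hold. The sketch below is only an illustration: it uses an in-memory database with a hypothetical two-row `data` table as a stand-in for `database.sqlite`, and the inserted values are made up.

```python
import sqlite3

# In-memory stand-in for database.sqlite; the rows are invented for illustration.
connection = sqlite3.connect(":memory:")
connection.execute('CREATE TABLE data (timestampms TEXT, "mention_Macron" INTEGER)')
connection.executemany(
    "INSERT INTO data VALUES (?, ?)",
    [("1486080000000", 1), ("1486083600000", 0)],
)

# sqlite_master lists every table stored in the database.
table_names = [
    row[0]
    for row in connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
]

# Counting rows in SQL avoids loading the whole table just to get its size.
row_count = connection.execute("SELECT COUNT(*) FROM data").fetchone()[0]
connection.close()

print(table_names, row_count)
```

With the real file, the same two queries can be run against `sqlite3.connect("database.sqlite")`.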
Parsing Dates
In [4]:
# Work on an explicit copy so the column assignments below
# do not raise SettingWithCopyWarning
mention_columns = ["mention_Fillon", "mention_Hamon", "mention_Le Pen",
                   "mention_Macron", "mention_Mélenchon"]
df_tweets = df[["timestampms"] + mention_columns].copy()
# timestampms holds Unix timestamps in milliseconds
df_tweets["datetime"] = pd.to_datetime(pd.to_numeric(df_tweets["timestampms"]), unit="ms")
# Keep only the date (time reset to midnight) so tweets can be grouped by day
df_tweets["datetime"] = df_tweets["datetime"].dt.normalize()
df_tweets = df_tweets[["datetime"] + mention_columns]
df_tweets.head()
Out[4]:
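The conversion from millisecond timestamps to calendar days can be checked on a few hand-made values. This is a minimal sketch, assuming `timestampms` is stored as text holding milliseconds since the Unix epoch; the three timestamps and the `toy` frame are hypothetical. It uses `dt.normalize()` to truncate each timestamp to midnight of its day.

```python
import pandas as pd

# Hypothetical millisecond timestamps: two on 3 February 2017, one on 4 February.
toy = pd.DataFrame({
    "timestampms": ["1486080000000", "1486126800000", "1486166400000"],
    "mention_Macron": [1, 0, 1],
})

# Parse the strings as numbers, then read them as milliseconds since the Unix epoch.
toy["datetime"] = pd.to_datetime(pd.to_numeric(toy["timestampms"]), unit="ms")

# normalize() resets the time of day to midnight, leaving one value per calendar day.
toy["day"] = toy["datetime"].dt.normalize()

print(toy[["datetime", "day"]])
```

The timestamps are interpreted as UTC, so a tweet sent shortly after midnight in Paris is counted on the previous day.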
Reindexing
In [5]:
df_tweets = df_tweets.set_index('datetime')
# Clear the index name so it does not appear as an extra header row
df_tweets.index.name = None
Group by Dates
In [6]:
group_tweets = df_tweets.groupby(df_tweets.index).sum()
group_tweets
Out[6]:
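To see what grouping on the index does, here is a small sketch on made-up data. It assumes each `mention_*` column holds a 0/1 flag per tweet, which is not confirmed by the dataset description; the `toy` frame and its values are hypothetical.

```python
import pandas as pd

# Hypothetical per-tweet flags indexed by day (1 = candidate mentioned).
days = pd.to_datetime(["2017-02-03", "2017-02-03", "2017-02-04"])
toy = pd.DataFrame(
    {"mention_Fillon": [1, 1, 0], "mention_Macron": [0, 1, 1]},
    index=days,
)

# Rows sharing the same index value are summed, giving one row per day.
daily = toy.groupby(toy.index).sum()

print(daily)
```

Under that assumption, each cell of the result is the number of collected tweets mentioning the candidate on that day.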
In [7]:
group_tweets.plot()
axes = plt.gca()
# Mark two campaign events with a dashed vertical line and a label at the top of the plot
plt.axvline(x=datetime.datetime(year=2017, month=2, day=6), linestyle="dashed", color="r")
plt.text(datetime.datetime(year=2017, month=2, day=6), y=axes.get_ylim()[1],
         s="06/02: Fillon's press conference,\nhe apologized for hiring his wife", fontsize=9)
plt.axvline(x=datetime.datetime(year=2017, month=2, day=21), linestyle="dashed", color="r")
plt.text(datetime.datetime(year=2017, month=2, day=21), y=axes.get_ylim()[1],
         s="21/02: Marine Le Pen refuses\nto wear a Muslim headscarf", fontsize=9)
plt.show()